adaptive sequential partition
Density Estimation via Discrepancy Based Adaptive Sequential Partition
Given $iid$ observations from an unknown continuous distribution defined on some domain $\Omega$, we propose a nonparametric method to learn a piecewise constant function to approximate the underlying probability density function. Our density estimate is a piecewise constant function defined on a binary partition of $\Omega$. The key ingredient of the algorithm is to use discrepancy, a concept originates from Quasi Monte Carlo analysis, to control the partition process. The resulting algorithm is simple, efficient, and has provable convergence rate. We demonstrate empirically its efficiency as a density estimation method. We also show how it can be utilized to find good initializations for k-means.
Reviews: Density Estimation via Discrepancy Based Adaptive Sequential Partition
Novelty: there are a fair number of adaptive partitioning density estimation methods in existence that the authors omit (Darbellay and Vajda, 1999; Petralia, Vogelstein, and Dunson, 2013; etc). Some of these are designed for high dimensions, some function in lower dimensions. In general, this space is fairly well studied but there is room for serious theoretical analysis. A much stronger result than the standard Monte Carlo integration rates would be something like rates in Hellinger or Lp distance, which are generally not dimension-free. These rates and associated traits (adaptivity to intrinsic dimension, shape constraints, etc) are generally much more predictive of how a method performs in the wild than simple MC integration rates.
Density Estimation via Discrepancy Based Adaptive Sequential Partition
Li, Dangna, Yang, Kun, Wong, Wing Hung
Given $iid$ observations from an unknown continuous distribution defined on some domain $\Omega$, we propose a nonparametric method to learn a piecewise constant function to approximate the underlying probability density function. Our density estimate is a piecewise constant function defined on a binary partition of $\Omega$. The key ingredient of the algorithm is to use discrepancy, a concept originates from Quasi Monte Carlo analysis, to control the partition process. The resulting algorithm is simple, efficient, and has provable convergence rate. We demonstrate empirically its efficiency as a density estimation method.